Summary



Freeman has considered the following two-stage procedure for finding a confidence
interval for the treatment difference theta, using data from an AB/BA crossover
trial. In the first stage, a preliminary test of the null hypothesis that the
differential carryover is zero is carried out. If this hypothesis is accepted then the
confidence interval for theta is constructed assuming that the differential carryover
is zero. If, on the other hand, this hypothesis is rejected then this confidence
interval is constructed using only data from the first period. Freeman has shown that
this confidence interval has minimum coverage probability far below nominal. He
therefore concludes that this confidence interval should not be used. In the present
paper, we analyse the performance of a similar two-stage procedure for an ABAB/BABA
crossover trial. This trial differs in very significant ways from an AB/BA crossover
trial, including the fact that for an ABAB/BABA crossover trial there is an unbiased
estimator of the differential carryover that is unaffected by between-subject
variation. Despite these great differences, we arrive at the same conclusion as
Freeman, namely that the confidence interval resulting from the two-stage procedure
should not be used.

Key words: crossover trials; differential carryover; preliminary hypothesis test; two- 
stage procedure. 
1. Introduction



Consider a two-treatment two-period crossover trial, with continuous responses.
The purpose of this trial is to find a $1 - \alpha$ confidence interval for the
difference $\theta$ in the effects of two treatments, labelled A and B. Subjects are
randomly allocated to either group 1 or group 2. Subjects in group 1 receive treatment
A in the first period and then receive treatment B in the second period. Subjects in
group 2 receive treatment B in the first period and then receive treatment A in the
second period. This trial is called an AB/BA trial. To deal with the possibility of
non-zero differential carryover, it was suggested (starting with Grizzle, 1965, 1974
and endorsed by Hills & Armitage, 1979 and Armitage & Hills, 1982) that the following
two-stage procedure be used. In the first stage, a preliminary test of the null
hypothesis that the differential carryover is zero (against the alternative that it is
non-zero) is carried out. If this null hypothesis is accepted then the confidence
interval for $\theta$ is constructed to have nominal coverage $1 - \alpha$, assuming
that there is no differential carryover. If, on the other hand, this null hypothesis
is rejected then this confidence interval is constructed using only data from the
first period (since this is unaffected by carryover). As pointed out by Freeman
(1989), accepting this null hypothesis is not equivalent to concluding that the
differential carryover is exactly zero. Freeman (1989) shows that the confidence
interval resulting from this two-stage procedure has minimum coverage probability far
below $1 - \alpha$, demonstrating that this confidence interval should not be used.
Senn (2006) states "In my opinion the most important paper on cross-over trials in the
25 years of Statistics in Medicine is Peter Freeman's paper".

What is the performance of this type of two-stage procedure for other crossover
designs? Jones & Kenward (2003, pp. 123-125) analyse the performance of this type of
procedure for Balaam's design. This analysis makes the following two assumptions. The
first assumption is that any carryover from a treatment in a given period is only into
the next period, and not beyond ("first-order carryover" model). The second assumption
is that the carryover from one period into the next period is determined only by the
treatment applied in the first period and not the treatment applied in the second
period. Thus, for example, according to this second assumption the carryover from
treatment A into the next period is the same, irrespective of whether the treatment in
the next period is A or B. This assumption has rightly been criticized as being
unrealistic by Fleiss (1986, 1989), Senn & Lambrou (1998) and Senn (2001, 2002, 2005).
This severely limits the applicability of the analysis of Jones & Kenward (2003) of
this type of procedure for Balaam's design.

In the present paper we consider an ABAB/BABA crossover trial. Subjects 
are randomly allocated to either group 1 or group 2. Subjects in group 1 receive 
treatments A, B, A and B in the first, second, third and fourth periods respectively. 
Subjects in group 2 receive treatments B, A, B and A in the first, second, third and 
fourth periods respectively. We assume that any carryover from a treatment in a 
given period is only into the next period, and not beyond. However, our analysis 
of this trial does not require us to assume that the carryover from one period into 
the next period is determined only by the treatment applied in the first period and 
not the treatment applied in the second period. This is because we never need to 
consider the carryover of a treatment from one period into the next period for which 
the same treatment is applied. Two major differences between the AB/BA and 
ABAB/BABA trials are the following. For an ABAB/BABA trial: 

(i) There is an unbiased estimator of $\theta$ (which is unaffected by differential
carryover) that has the following properties. It is unaffected by the between-subject
variation. Also, it is obtained without ignoring all of the data from periods 2,
3 and 4. This is the estimator $\hat\Theta$ described in Section 2.

(ii) There is an unbiased estimator of the differential carryover that is unaffected
by the between-subject variation. This is the estimator $\hat\psi$ described in
Section 2.

There are two arguments against the adoption of $\hat\Theta$ as the standard estimator
of $\theta$. Firstly, as shown in Appendix A, this estimator is inefficient by
comparison with the usual estimator of $\theta$ based on data from a completely
randomized design, using the same number of measurements of the response, unless a
restrictive condition holds. Secondly, there is an estimator of $\theta$, which we
denote by $A$ and describe in Section 2, that is much more efficient than
$\hat\Theta$ when the differential carryover is zero. We view $\hat\Theta$ as the
analogue for an ABAB/BABA design of the estimator of $\theta$ constructed using only
data from the first period of an AB/BA design.

To deal with the possibility of non-zero differential carryover, it is tempting to
consider the use of the following two-stage procedure. In the first stage a
preliminary test of the null hypothesis that the differential carryover is zero
(against the alternative that it is non-zero) is carried out. If this null hypothesis
is accepted then the confidence interval for $\theta$ is constructed using the
estimator $A$ (described in Section 2) and having nominal coverage $1 - \alpha$,
assuming that there is no differential carryover. If, on the other hand, this null
hypothesis is rejected then this confidence interval is constructed to have nominal
coverage $1 - \alpha$, using the estimator $\hat\Theta$ (described in Section 2) that
is based on data from all 4 periods. This two-stage procedure is described in detail
in Section 2.

A computationally convenient formula for the coverage probability of the confidence
interval that results from this procedure is presented in Section 2. In Section 3 we
numerically evaluate the coverage properties of this confidence interval. We show that
this confidence interval has minimum coverage probability far below $1 - \alpha$,
demonstrating that this confidence interval should not be used. The coverage
probability of this confidence interval depends only on the scaled differential
carryover. This is in sharp contrast to the coverage probability of the confidence
interval resulting from the two-stage procedure applied to data from an AB/BA trial,
found by Freeman (1989), which depends on both the scaled differential carryover and
the ratio (error variance)/(subject variance).

Beginning with the work of Freeman (1989), the literature on the effect of preliminary
model selection (using, for example, hypothesis tests or minimizing a criterion such
as AIC or Mallows's $C_p$) on confidence intervals has grown steadily. This literature
is reviewed by Kabaila (2009). It is commonly the case that preliminary model
selection has a highly detrimental effect on the coverage probability of these
confidence intervals. However, each case needs to be considered individually on its
merits.
2. The two-stage analysis of ABAB/BABA trials under consideration

We assume the following model for the ABAB/BABA trial. This model is similar to the
model for an AB/BA crossover trial put forward by Grizzle (1965), as described by
Grieve (1987). Let $n_1$ and $n_2$ denote the number of subjects in group 1 and group
2 respectively. Also let $Y_{ijk}$ be the response of the $j$th subject in the $i$th
group and the $k$th period ($i = 1, 2$; $j = 1, \dots, n_i$; $k = 1, 2, 3, 4$). The
model is

$$Y_{ijk} = \mu + \xi_{ij} + \pi_k + \phi_t + \lambda_q + \varepsilon_{ijk} \tag{1}$$

where

$\mu$ is the overall population mean

$\xi_{ij}$ is the effect of the $j$th subject in the $i$th group

$\pi_k$ is the effect of the $k$th period

$\phi_t$ is the effect of the $t$th treatment

$\lambda_q$ is the residual effect of the $q$th treatment

$\varepsilon_{ijk}$ is the random error

Note that both $t$ and $q$ are determined by the group $i$ and the period $k$. This
model is described in less abbreviated form in Appendix B. We assume that the
$\xi_{ij}$ and $\varepsilon_{ijk}$ are independent and that the $\xi_{ij}$ are
identically $N(0, \sigma_s^2)$ distributed and the $\varepsilon_{ijk}$ are identically
$N(0, \sigma_\varepsilon^2)$ distributed, where $\sigma_s^2 > 0$ and
$\sigma_\varepsilon^2 > 0$. Let $m = (1/n_1) + (1/n_2)$.

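The parameter of interest is $\theta = \phi_1 - \phi_2$. The parameter describing the
differential carryover effect is $\psi = 3(\lambda_1 - \lambda_2)/4$. Let
$\bar Y_{i \cdot k} = (1/n_i) \sum_{j=1}^{n_i} Y_{ijk}$ ($i = 1, 2$; $k = 1, 2, 3, 4$).
We reduce the data to $D_1$, $D_2$, $D_3$, $D_4$, where
$D_1 = \bar Y_{1 \cdot 1} - \bar Y_{2 \cdot 1}$,
$D_2 = \bar Y_{1 \cdot 2} - \bar Y_{2 \cdot 2}$,
$D_3 = \bar Y_{1 \cdot 3} - \bar Y_{2 \cdot 3}$ and
$D_4 = \bar Y_{1 \cdot 4} - \bar Y_{2 \cdot 4}$. The motivation for this data
reduction is presented in Appendix B. Let

$$A = \tfrac{1}{4} (D_1 - D_2 + D_3 - D_4).$$

This is the usual estimator of $\theta$, when it is assumed that $\psi = 0$ (see e.g.
Table I of Senn & Lambrou, 1998). Let

$$\hat\Theta = D_1 - \tfrac{1}{4} D_2 - \tfrac{1}{2} D_3 - \tfrac{1}{4} D_4.$$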
This is an unbiased estimator of $\theta$ (which is unaffected by differential
carryover) that is unaffected by between-subject variation (cf. Table I of Senn &
Lambrou, 1998). We will use $\hat\Theta$ as the estimator of $\theta$ when it cannot
be assumed that necessarily $\psi = 0$. We view $\hat\Theta$ as the analogue for an
ABAB/BABA design of the estimator of $\theta$ constructed using only data from the
first period of an AB/BA design. As shown in Appendix A, $\hat\Theta$ is inefficient
by comparison with the usual estimator of $\theta$ based on data from a completely
randomized trial, using the same number of measurements of response, unless
$\sigma_s^2 > 4.5 \sigma_\varepsilon^2$. We will also make use of the following
unbiased estimator of $\psi$:

$$\hat\psi = \tfrac{3}{4} (D_1 - D_3).$$

As shown in Appendix B, these statistics have the following distributions:
$A \sim N(\theta - \psi, m \sigma_\varepsilon^2 / 4)$,
$\hat\Theta \sim N(\theta, 11 m \sigma_\varepsilon^2 / 8)$ and
$\hat\psi \sim N(\psi, 9 m \sigma_\varepsilon^2 / 8)$. Note that when $\psi = 0$, $A$
is a much more efficient estimator of $\theta$ than $\hat\Theta$.
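As a quick check of these statements, each estimator can be written as a linear
contrast of $D_1, \dots, D_4$. The Python sketch below uses the weights that are
consistent with the distributions stated above and with the expressions in Appendix B,
and computes (a) the coefficients of the subject difference, $\theta$ and $\psi$ in
each estimator's mean and (b) the factor multiplying $m \sigma_\varepsilon^2$ in its
variance. The names `WEIGHTS`, `MEAN_COEFS`, `estimate`, `mean_coefs` and
`variance_factor` are illustrative only.

```python
from fractions import Fraction as F

# Weights on (D1, D2, D3, D4) for each estimator (illustrative names).
WEIGHTS = {
    "A": (F(1, 4), F(-1, 4), F(1, 4), F(-1, 4)),
    "Theta_hat": (F(1), F(-1, 4), F(-1, 2), F(-1, 4)),
    "psi_hat": (F(3, 4), F(0), F(-3, 4), F(0)),
}

# Coefficients of (subject difference, theta, psi) in E[D_k], k = 1..4,
# using lambda_1 - lambda_2 = 4 * psi / 3.
MEAN_COEFS = [
    (1, F(1), F(0)),
    (1, F(-1), F(4, 3)),
    (1, F(1), F(-4, 3)),
    (1, F(-1), F(4, 3)),
]


def estimate(name, d):
    """Value of the named estimator for reduced data d = (D1, D2, D3, D4)."""
    return sum(w * x for w, x in zip(WEIGHTS[name], d))


def mean_coefs(name):
    """Coefficients of (subject difference, theta, psi) in the estimator's mean."""
    return tuple(
        sum(w * c[i] for w, c in zip(WEIGHTS[name], MEAN_COEFS)) for i in range(3)
    )


def variance_factor(name):
    """Var(estimator) / (m * sigma_eps^2): the sum of squared weights,
    since the error terms of D1..D4 are independent with equal variance."""
    return sum(w * w for w in WEIGHTS[name])
```

For example, `mean_coefs("A")` evaluates to `(0, 1, -1)`, so that $A$ has mean
$\theta - \psi$, while `variance_factor("Theta_hat")` evaluates to `11/8`.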

To deal with the possibility of non-zero differential carryover, it is tempting to
consider the use of the following two-stage procedure. In the first stage a
preliminary test of the null hypothesis that $\psi = 0$ (against the alternative that
$\psi \neq 0$) is carried out, using a test statistic based on $\hat\psi$. If this
null hypothesis is accepted then the confidence interval for $\theta$ is constructed
using the estimator $A$ and having nominal coverage $1 - \alpha$, assuming that
$\psi = 0$. If, on the other hand, this null hypothesis is rejected then this
confidence interval is constructed to have nominal coverage $1 - \alpha$, using the
estimator $\hat\Theta$.

To analyse the properties of this two-stage procedure, we make the simplification that
$\sigma_\varepsilon^2$ is known. Freeman (1989) makes the same simplification. So, in
the first stage, we test the null hypothesis $H_0 : \psi = 0$ against the alternative
hypothesis $H_1 : \psi \neq 0$ using the test statistic
$\sqrt{8/(9m)}\, \hat\psi / \sigma_\varepsilon$. This test statistic has an $N(0, 1)$
distribution under $H_0$. Define the quantile $c_a$ by the requirement that
$P(-c_a < Z < c_a) = 1 - a$ for $Z \sim N(0, 1)$. The following is a test of $H_0$
against $H_1$, with level of significance $\alpha_1$. Accept $H_0$ if
$|\sqrt{8/(9m)}\, \hat\psi / \sigma_\varepsilon| < c_{\alpha_1}$; otherwise reject
$H_0$. In the second stage we proceed as follows. If $H_0$ is accepted then we
construct a confidence interval for $\theta$, with nominal coverage $1 - \alpha$,
assuming that $\psi = 0$. This confidence interval is

$$\left[ A - c_\alpha \sqrt{m/4}\, \sigma_\varepsilon, \; A + c_\alpha \sqrt{m/4}\, \sigma_\varepsilon \right]. \tag{2}$$



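If, on the other hand, $H_0$ is rejected then we do not assume that $\psi = 0$ and we
construct a confidence interval for $\theta$, with nominal coverage $1 - \alpha$,
based on $\hat\Theta$. This confidence interval is

$$\left[ \hat\Theta - c_\alpha \sqrt{11m/8}\, \sigma_\varepsilon, \; \hat\Theta + c_\alpha \sqrt{11m/8}\, \sigma_\varepsilon \right]. \tag{3}$$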



Let $J$ denote the confidence interval for $\theta$ that results from this two-stage
procedure. Also let $\gamma = \sqrt{8/(9m)}\, \psi / \sigma_\varepsilon$. As shown in
Appendix B, the coverage probability of the confidence interval $J$ is

$$P(\theta \in J) = P(|H| < c_{\alpha_1})\, P(|X| < c_\alpha) + P(|G| < c_\alpha, \, |H| > c_{\alpha_1}), \tag{4}$$

where

$$\begin{bmatrix} G \\ H \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ \gamma \end{bmatrix}, \begin{bmatrix} 1 & 3/\sqrt{11} \\ 3/\sqrt{11} & 1 \end{bmatrix} \right) \tag{5}$$

and $X \sim N(-3\gamma/\sqrt{2}, 1)$. Note that, for given $\alpha_1$ and $\alpha$,
the coverage probability (4) is a function of the scaled differential carryover
$\gamma$. The right-hand side of (4) is easily computed (using e.g. R or MATLAB
programs), for each given $\gamma$. The last term on the right-hand side of (4) can be
computed by evaluating the cumulative distribution function of the bivariate normal
distribution (5). Alternatively, this term can be computed by numerically evaluating
the integral (11), derived in Appendix C.
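The computation just described can be sketched in a few lines. The Python function
below is an illustration only, not the R or MATLAB programs referred to above: it
evaluates the right-hand side of (4) using the single-integral form of Appendix C,
with composite Simpson's rule standing in for a library quadrature routine. The names
`quantile` and `coverage` and the default `n_steps` are assumptions made for this
sketch.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def quantile(a):
    """c_a such that P(-c_a < Z < c_a) = 1 - a for Z ~ N(0, 1)."""
    return STD.inv_cdf(1.0 - a / 2.0)


def coverage(gamma, alpha=0.05, alpha1=0.1, n_steps=2000):
    """Coverage probability of the two-stage interval J at scaled carryover gamma.

    n_steps is the (even) number of Simpson subintervals on [-c_alpha, c_alpha].
    """
    c, c1 = quantile(alpha), quantile(alpha1)

    # First term: P(|H| < c1) * P(|X| < c), with H ~ N(gamma, 1)
    # and X ~ N(-3 * gamma / sqrt(2), 1).
    p_h = STD.cdf(c1 - gamma) - STD.cdf(-c1 - gamma)
    mean_x = -3.0 * gamma / sqrt(2.0)
    p_x = STD.cdf(c - mean_x) - STD.cdf(-c - mean_x)

    # P(|G| < c, |H| < c1): integrate over g the conditional probability that
    # |H| < c1 given G = g, where H | G = g ~ N(gamma + rho * g, 1 - rho^2).
    rho = 3.0 / sqrt(11.0)
    cond_sd = sqrt(1.0 - rho * rho)  # equals sqrt(2 / 11)

    def integrand(g):
        mu = gamma + rho * g
        inner = STD.cdf((c1 - mu) / cond_sd) - STD.cdf((-c1 - mu) / cond_sd)
        return inner * STD.pdf(g)

    h = 2.0 * c / n_steps
    total = integrand(-c) + integrand(c)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * integrand(-c + i * h)
    p_both = total * h / 3.0

    # Second term: P(|G| < c, |H| > c1) = (1 - alpha) - P(|G| < c, |H| < c1).
    return p_h * p_x + (1.0 - alpha) - p_both
```

Calling `coverage(g)` over a grid of values `g` gives the coverage probability as a
function of $\gamma$. For large $|\gamma|$ the preliminary test almost always rejects,
so the value approaches the nominal $1 - \alpha$.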



3. Numerical evaluation of the coverage probability as a function of $\gamma$

Consider the two-stage procedure, for an AB/BA trial, based on a preliminary test with
given level of significance and resulting in a confidence interval with a given
nominal coverage. As shown by Freeman (1989), the actual coverage probability of this
confidence interval depends on both the scaled differential carryover
($\lambda \sqrt{n} / \sigma$ in Freeman's notation) and
$\rho = \sigma_s^2 / (\sigma_\varepsilon^2 + \sigma_s^2)$. For each different value of
$\rho$, there is a different graph of this coverage probability as a function of the
scaled differential carryover. The larger the value of $\rho$, the smaller the minimum
coverage probability of this confidence interval.

Now consider the two-stage procedure described in the previous section, for an
ABAB/BABA trial, based on a preliminary test with given level of significance and
resulting in a confidence interval with a given nominal coverage. In sharp contrast to
the AB/BA trial, the actual coverage probability (given by (4)) of this confidence
interval depends only on the scaled differential carryover $\gamma$. This coverage is
uninfluenced by the between-subject variability (which is described by the parameter
$\sigma_s^2$). For level of significance $\alpha_1 = 0.1$ of the preliminary test and
nominal coverage $1 - \alpha = 0.95$, this coverage probability as a function of
$\gamma$ is shown in Figure 1. The minimum coverage probability of this confidence
interval is 0.4711, showing that this confidence interval is completely inadequate.
The minimum coverage probability of this confidence interval was computed for a wide
range of values of $\alpha_1$ and $1 - \alpha$. In every case, this confidence
interval was found to have minimum coverage probability far below nominal, showing
that it is completely inadequate. Note that for a given level of significance
$\alpha_1$ of the preliminary test and given nominal coverage $1 - \alpha$, the
minimum coverage probability of this confidence interval does not depend on either of
the sample sizes $n_1$ and $n_2$. The only effect of an increase in $n_1$ and $n_2$ is
to change the scaling (via $m = (1/n_1) + (1/n_2)$) of the differential carryover
$\psi$. Consequently, the harmful effect of preliminary hypothesis testing does not
disappear with an increase in sample sizes $n_1$ and $n_2$.
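The claim that the coverage depends only on $\gamma$, and not on the between-subject
variability, can also be checked by simulation. The following Python sketch is a Monte
Carlo illustration, not the numerical method used to produce Figure 1. It simulates
the period differences $D_1, \dots, D_4$ directly, applies the two-stage procedure
with known $\sigma_\varepsilon$ and records how often the resulting interval contains
$\theta$. The function name `simulate_coverage` and the default values of the sample
sizes, standard deviations, number of replications and seed are arbitrary choices made
for this sketch.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_coverage(gamma, n1=10, n2=10, sigma_eps=1.0, sigma_s=2.0,
                      alpha=0.05, alpha1=0.1, reps=20000, seed=1):
    """Monte Carlo estimate of the coverage of the two-stage interval."""
    rng = random.Random(seed)
    z = NormalDist()
    c = z.inv_cdf(1.0 - alpha / 2.0)
    c1 = z.inv_cdf(1.0 - alpha1 / 2.0)
    m = 1.0 / n1 + 1.0 / n2
    theta = 0.0  # the coverage does not depend on the true theta
    psi = gamma * sigma_eps * sqrt(9.0 * m / 8.0)
    delta = 4.0 * psi / 3.0  # lambda_1 - lambda_2
    # (coefficient of theta, coefficient of delta) in E[D_k], k = 1..4
    signs = [(1, 0), (-1, 1), (1, -1), (-1, 1)]
    hits = 0
    for _ in range(reps):
        # Difference of the two group-mean subject effects, common to all D_k.
        subj = rng.gauss(0.0, sigma_s * sqrt(m))
        d = [subj + s * theta + t * delta + rng.gauss(0.0, sigma_eps * sqrt(m))
             for s, t in signs]
        a = (d[0] - d[1] + d[2] - d[3]) / 4.0
        theta_hat = d[0] - d[1] / 4.0 - d[2] / 2.0 - d[3] / 4.0
        psi_hat = 0.75 * (d[0] - d[2])
        if abs(sqrt(8.0 / (9.0 * m)) * psi_hat / sigma_eps) < c1:
            # H0 accepted: interval based on A, assuming psi = 0.
            centre, half = a, c * sqrt(m / 4.0) * sigma_eps
        else:
            # H0 rejected: interval based on Theta_hat.
            centre, half = theta_hat, c * sqrt(11.0 * m / 8.0) * sigma_eps
        hits += abs(centre - theta) < half
    return hits / reps
```

Running `simulate_coverage(1.4)` with different values of `sigma_s`, `n1` and `n2`
should give estimates that agree up to simulation error, since the subject term
cancels in all three estimators and the sample sizes enter only through $m$.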



Figure 1: Plot of the coverage probability of the confidence interval for $\theta$,
resulting from the two-stage procedure, against $\gamma$. This confidence interval has
nominal coverage $1 - \alpha = 0.95$. The preliminary hypothesis test has significance
level $\alpha_1 = 0.1$. The horizontal dashed line has vertical axis intercept 0.95.



4. Conclusion 

For an ABAB/BABA trial, we have shown that the minimum coverage probability of the
confidence interval resulting from the two-stage procedure is far below the nominal
coverage, showing that this confidence interval is completely inadequate. Increasing
the sample sizes $n_1$ and $n_2$ does not improve the situation. Our conclusion is
that this confidence interval should not be used. This is similar to the conclusion of
Freeman (1989) for confidence intervals resulting from a two-stage procedure applied
to an AB/BA trial. In other words, we provide further support for the rejection by
Senn (2002, p. 12) of analyses of data from any two-treatment crossover trial based on
a preliminary test of the null hypothesis that the differential carryover is zero.
Appendix A: The efficiency of $\hat\Theta$ by comparison with an
estimator from a completely randomized trial



In this appendix, we consider the efficiency of $\hat\Theta$ by comparison with the
usual estimator of $\theta$ based on data from a completely randomized trial. For an
ABAB/BABA crossover trial, the total number of measurements of the response is
$4(n_1 + n_2)$. We therefore compare $\hat\Theta$ with the usual estimator of $\theta$
based on data from a completely randomized trial, with $2(n_1 + n_2)$ randomly-chosen
subjects in each group.

Let $Y^A_1, \dots, Y^A_{2(n_1+n_2)}$ denote the responses of the $2(n_1 + n_2)$
subjects given treatment A. Also let $Y^B_1, \dots, Y^B_{2(n_1+n_2)}$ denote the
responses of the $2(n_1 + n_2)$ subjects given treatment B. Consistently with the
model (1), we suppose that $Y^A_1, \dots, Y^A_{2(n_1+n_2)}, Y^B_1, \dots,
Y^B_{2(n_1+n_2)}$ are independent random variables, where
$Y^A_1, \dots, Y^A_{2(n_1+n_2)}$ are identically
$N(\mu + \phi_1, \sigma_\varepsilon^2 + \sigma_s^2)$ distributed and
$Y^B_1, \dots, Y^B_{2(n_1+n_2)}$ are identically
$N(\mu + \phi_2, \sigma_\varepsilon^2 + \sigma_s^2)$ distributed. The usual estimator
of $\theta$ is

$$\tilde\Theta = \frac{1}{2(n_1+n_2)} \left( Y^A_1 + \dots + Y^A_{2(n_1+n_2)} \right) - \frac{1}{2(n_1+n_2)} \left( Y^B_1 + \dots + Y^B_{2(n_1+n_2)} \right).$$

This estimator has an $N(\theta, (\sigma_s^2 + \sigma_\varepsilon^2)/(n_1 + n_2))$
distribution. Suppose, for simplicity, that $n_1 = n_2 = n$. Thus
$\mathrm{Var}(\tilde\Theta) = (\sigma_s^2 + \sigma_\varepsilon^2)/(2n)$ and
$\mathrm{Var}(\hat\Theta) = 11 \sigma_\varepsilon^2/(4n)$. Thus
$\mathrm{Var}(\hat\Theta) < \mathrm{Var}(\tilde\Theta)$ if and only if
$\sigma_s^2 > 4.5 \sigma_\varepsilon^2$.
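The threshold 4.5 follows from a short calculation, written here with $\sigma_s^2$ for
the between-subject variance and $\sigma_\varepsilon^2$ for the error variance. With
$n_1 = n_2 = n$ we have $m = 2/n$, so the variance $11 m \sigma_\varepsilon^2 / 8$ of
the crossover estimator equals $11 \sigma_\varepsilon^2 / (4n)$, and

$$\frac{11 \sigma_\varepsilon^2}{4n} < \frac{\sigma_s^2 + \sigma_\varepsilon^2}{2n}
\iff 11 \sigma_\varepsilon^2 < 2 \sigma_s^2 + 2 \sigma_\varepsilon^2
\iff \sigma_s^2 > \tfrac{9}{2} \sigma_\varepsilon^2.$$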



Appendix B: Details for Section 2 

This appendix consists of 3 sections. In the first section, we carry out data
reduction. In the second section, we derive the distributions of the statistics $A$,
$\hat\Theta$ and $\hat\psi$. In the third section, we derive the formula (4) for the
coverage probability of the confidence interval $J$ resulting from the two-stage
procedure.
Data reduction

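It follows from the model (1) that

$$\begin{aligned}
Y_{1j1} &= \mu + \xi_{1j} + \pi_1 + \phi_1 + \varepsilon_{1j1} \\
Y_{1j2} &= \mu + \xi_{1j} + \pi_2 + \phi_2 + \lambda_1 + \varepsilon_{1j2} \\
Y_{1j3} &= \mu + \xi_{1j} + \pi_3 + \phi_1 + \lambda_2 + \varepsilon_{1j3} \\
Y_{1j4} &= \mu + \xi_{1j} + \pi_4 + \phi_2 + \lambda_1 + \varepsilon_{1j4} \\
Y_{2j1} &= \mu + \xi_{2j} + \pi_1 + \phi_2 + \varepsilon_{2j1} \\
Y_{2j2} &= \mu + \xi_{2j} + \pi_2 + \phi_1 + \lambda_2 + \varepsilon_{2j2} \\
Y_{2j3} &= \mu + \xi_{2j} + \pi_3 + \phi_2 + \lambda_1 + \varepsilon_{2j3} \\
Y_{2j4} &= \mu + \xi_{2j} + \pi_4 + \phi_1 + \lambda_2 + \varepsilon_{2j4}
\end{aligned}$$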

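Let $\bar Y_{i \cdot k} = (1/n_i) \sum_{j=1}^{n_i} Y_{ijk}$ ($i = 1, 2$;
$k = 1, 2, 3, 4$). We first reduce the data to $\bar Y_{1 \cdot 1}$,
$\bar Y_{1 \cdot 2}$, $\bar Y_{1 \cdot 3}$, $\bar Y_{1 \cdot 4}$,
$\bar Y_{2 \cdot 1}$, $\bar Y_{2 \cdot 2}$, $\bar Y_{2 \cdot 3}$ and
$\bar Y_{2 \cdot 4}$. Note that

$$\begin{aligned}
\bar Y_{1 \cdot 1} &= \mu + \bar\xi_{1 \cdot} + \pi_1 + \phi_1 + \bar\varepsilon_{1 \cdot 1} \\
\bar Y_{1 \cdot 2} &= \mu + \bar\xi_{1 \cdot} + \pi_2 + \phi_2 + \lambda_1 + \bar\varepsilon_{1 \cdot 2} \\
\bar Y_{1 \cdot 3} &= \mu + \bar\xi_{1 \cdot} + \pi_3 + \phi_1 + \lambda_2 + \bar\varepsilon_{1 \cdot 3} \\
\bar Y_{1 \cdot 4} &= \mu + \bar\xi_{1 \cdot} + \pi_4 + \phi_2 + \lambda_1 + \bar\varepsilon_{1 \cdot 4} \\
\bar Y_{2 \cdot 1} &= \mu + \bar\xi_{2 \cdot} + \pi_1 + \phi_2 + \bar\varepsilon_{2 \cdot 1} \\
\bar Y_{2 \cdot 2} &= \mu + \bar\xi_{2 \cdot} + \pi_2 + \phi_1 + \lambda_2 + \bar\varepsilon_{2 \cdot 2} \\
\bar Y_{2 \cdot 3} &= \mu + \bar\xi_{2 \cdot} + \pi_3 + \phi_2 + \lambda_1 + \bar\varepsilon_{2 \cdot 3} \\
\bar Y_{2 \cdot 4} &= \mu + \bar\xi_{2 \cdot} + \pi_4 + \phi_1 + \lambda_2 + \bar\varepsilon_{2 \cdot 4}
\end{aligned}$$

where $\bar\xi_{i \cdot} = (1/n_i) \sum_{j=1}^{n_i} \xi_{ij}$ and
$\bar\varepsilon_{i \cdot k} = (1/n_i) \sum_{j=1}^{n_i} \varepsilon_{ijk}$. Note that
$\bar\xi_{1 \cdot}$, $\bar\xi_{2 \cdot}$,
$\bar\varepsilon_{1 \cdot 1}, \dots, \bar\varepsilon_{1 \cdot 4}$,
$\bar\varepsilon_{2 \cdot 1}, \dots, \bar\varepsilon_{2 \cdot 4}$ are independent,
$\bar\xi_{1 \cdot} \sim N(0, \sigma_s^2/n_1)$,
$\bar\xi_{2 \cdot} \sim N(0, \sigma_s^2/n_2)$,
$\bar\varepsilon_{1 \cdot 1}, \dots, \bar\varepsilon_{1 \cdot 4}$ are identically
$N(0, \sigma_\varepsilon^2/n_1)$ distributed and
$\bar\varepsilon_{2 \cdot 1}, \dots, \bar\varepsilon_{2 \cdot 4}$ are identically
$N(0, \sigma_\varepsilon^2/n_2)$ distributed.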

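The only way to remove the influence of the parameters $\pi_1, \dots, \pi_4$ on the
reduced data $\bar Y_{1 \cdot 1}, \bar Y_{1 \cdot 2}, \dots, \bar Y_{2 \cdot 4}$ is to
perform a further data reduction to $D_1, \dots, D_4$, where
$D_1 = \bar Y_{1 \cdot 1} - \bar Y_{2 \cdot 1}$,
$D_2 = \bar Y_{1 \cdot 2} - \bar Y_{2 \cdot 2}$,
$D_3 = \bar Y_{1 \cdot 3} - \bar Y_{2 \cdot 3}$ and
$D_4 = \bar Y_{1 \cdot 4} - \bar Y_{2 \cdot 4}$. Note that

$$\begin{aligned}
D_1 &= \bar\xi_{1 \cdot} - \bar\xi_{2 \cdot} + \theta + \eta_1 \\
D_2 &= \bar\xi_{1 \cdot} - \bar\xi_{2 \cdot} - \theta + \tfrac{4}{3} \psi + \eta_2 \\
D_3 &= \bar\xi_{1 \cdot} - \bar\xi_{2 \cdot} + \theta - \tfrac{4}{3} \psi + \eta_3 \\
D_4 &= \bar\xi_{1 \cdot} - \bar\xi_{2 \cdot} - \theta + \tfrac{4}{3} \psi + \eta_4
\end{aligned}$$

where $\eta_1 = \bar\varepsilon_{1 \cdot 1} - \bar\varepsilon_{2 \cdot 1}$,
$\eta_2 = \bar\varepsilon_{1 \cdot 2} - \bar\varepsilon_{2 \cdot 2}$,
$\eta_3 = \bar\varepsilon_{1 \cdot 3} - \bar\varepsilon_{2 \cdot 3}$ and
$\eta_4 = \bar\varepsilon_{1 \cdot 4} - \bar\varepsilon_{2 \cdot 4}$. Note that
$\eta_1, \dots, \eta_4$ are independent and identically $N(0, m \sigma_\varepsilon^2)$
distributed.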

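Derivation of the distributions of the statistics $A$, $\hat\Theta$ and $\hat\psi$

Note that

$$A = \theta - \psi + \tfrac{1}{4} (\eta_1 - \eta_2 + \eta_3 - \eta_4). \tag{6}$$

Thus $A \sim N(\theta - \psi, m \sigma_\varepsilon^2 / 4)$. Note that

$$\hat\Theta = \theta + \eta_1 - \tfrac{1}{4} \eta_2 - \tfrac{1}{2} \eta_3 - \tfrac{1}{4} \eta_4 \tag{7}$$

and that

$$\hat\psi = \psi + \tfrac{3}{4} (\eta_1 - \eta_3). \tag{8}$$

It follows from (6) and (8) that $(A, \hat\psi)$ has a bivariate normal distribution
and that $\mathrm{Cov}(A, \hat\psi) = 0$. Thus $A$ and $\hat\psi$ are independent
random variables. It follows from (7) and (8) that

$$\begin{bmatrix} \hat\Theta \\ \hat\psi \end{bmatrix} \sim N\left( \begin{bmatrix} \theta \\ \psi \end{bmatrix}, \frac{m \sigma_\varepsilon^2}{8} \begin{bmatrix} 11 & 9 \\ 9 & 9 \end{bmatrix} \right). \tag{9}$$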
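The variances and covariances used here follow from the coefficients of the
independent error terms $\eta_1, \dots, \eta_4$ in the expressions for $A$,
$\hat\Theta$ and $\hat\psi$, each $\eta_k$ having variance $m \sigma_\varepsilon^2$.
Written out, with the coefficients taken as $(\tfrac14, -\tfrac14, \tfrac14,
-\tfrac14)$ for $A$, $(1, -\tfrac14, -\tfrac12, -\tfrac14)$ for $\hat\Theta$ and
$(\tfrac34, 0, -\tfrac34, 0)$ for $\hat\psi$:

$$\begin{aligned}
\mathrm{Var}(\hat\Theta) &= \left( 1 + \tfrac{1}{16} + \tfrac{1}{4} + \tfrac{1}{16} \right) m \sigma_\varepsilon^2 = \tfrac{11}{8}\, m \sigma_\varepsilon^2, \\
\mathrm{Var}(\hat\psi) &= \left( \tfrac{9}{16} + \tfrac{9}{16} \right) m \sigma_\varepsilon^2 = \tfrac{9}{8}\, m \sigma_\varepsilon^2, \\
\mathrm{Cov}(\hat\Theta, \hat\psi) &= \left( 1 \cdot \tfrac{3}{4} + \left( -\tfrac{1}{2} \right) \left( -\tfrac{3}{4} \right) \right) m \sigma_\varepsilon^2 = \tfrac{9}{8}\, m \sigma_\varepsilon^2, \\
\mathrm{Cov}(A, \hat\psi) &= \left( \tfrac{1}{4} \cdot \tfrac{3}{4} + \tfrac{1}{4} \cdot \left( -\tfrac{3}{4} \right) \right) m \sigma_\varepsilon^2 = 0.
\end{aligned}$$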



Derivation of the formula (4) for the coverage probability 

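Define the event

$$B = \left\{ \sqrt{\frac{8}{9m}}\, \frac{|\hat\psi|}{\sigma_\varepsilon} < c_{\alpha_1} \right\}.$$

If this event occurs then $J$ is equal to (2) and if $B^c$ occurs then $J$ is equal to
(3). By the law of total probability, the coverage probability $P(\theta \in J)$ is
equal to

$$\begin{aligned}
& P(B \cap \{\theta \in J\}) + P(B^c \cap \{\theta \in J\}) \\
&= P\left( B \cap \left\{ \sqrt{\frac{4}{m}}\, \frac{|A - \theta|}{\sigma_\varepsilon} < c_\alpha \right\} \right) + P\left( B^c \cap \left\{ \sqrt{\frac{8}{11m}}\, \frac{|\hat\Theta - \theta|}{\sigma_\varepsilon} < c_\alpha \right\} \right) \\
&= P(B)\, P\left( \sqrt{\frac{4}{m}}\, \frac{|A - \theta|}{\sigma_\varepsilon} < c_\alpha \right) + P\left( B^c \cap \left\{ \sqrt{\frac{8}{11m}}\, \frac{|\hat\Theta - \theta|}{\sigma_\varepsilon} < c_\alpha \right\} \right)
\end{aligned}$$

since $A$ and $\hat\psi$ are independent random variables. Now define

$$\gamma = \sqrt{\frac{8}{9m}}\, \frac{\psi}{\sigma_\varepsilon}, \quad H = \sqrt{\frac{8}{9m}}\, \frac{\hat\psi}{\sigma_\varepsilon}, \quad G = \sqrt{\frac{8}{11m}}\, \frac{\hat\Theta - \theta}{\sigma_\varepsilon} \quad \text{and} \quad X = \sqrt{\frac{4}{m}}\, \frac{A - \theta}{\sigma_\varepsilon}.$$

Thus, the coverage probability $P(\theta \in J)$ is given by (4). Note that it follows
from (9) that the distribution of $(G, H)$ is given by (5).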

Appendix C: Alternative expression for $P(|G| < c_\alpha, |H| > c_{\alpha_1})$

In this appendix, we present an alternative expression for
$P(|G| < c_\alpha, |H| > c_{\alpha_1})$ that may be convenient for the computation of
the coverage probability (4). By the law of total probability,

$$P(|G| < c_\alpha) = P(|G| < c_\alpha, |H| > c_{\alpha_1}) + P(|G| < c_\alpha, |H| < c_{\alpha_1}),$$

so that

$$\begin{aligned}
P(|G| < c_\alpha, |H| > c_{\alpha_1}) &= P(|G| < c_\alpha) - P(|G| < c_\alpha, |H| < c_{\alpha_1}) \\
&= 1 - \alpha - P(|G| < c_\alpha, |H| < c_{\alpha_1})
\end{aligned}$$

since $G \sim N(0, 1)$.

Let $f_{G,H}(g, h)$ denote the probability density function of $(G, H)$, evaluated at
$(g, h)$. Also, let $f_{H|G}(h|g)$ denote the probability density function of $H$
conditional on $G = g$, evaluated at $h$. Let $\varphi$ denote the $N(0, 1)$
probability density function. Observe that

$$\begin{aligned}
P(|G| < c_\alpha, |H| < c_{\alpha_1}) &= \int_{-c_\alpha}^{c_\alpha} \int_{-c_{\alpha_1}}^{c_{\alpha_1}} f_{G,H}(g, h)\, dh\, dg \\
&= \int_{-c_\alpha}^{c_\alpha} \int_{-c_{\alpha_1}}^{c_{\alpha_1}} f_{H|G}(h|g)\, dh\, \varphi(g)\, dg.
\end{aligned} \tag{10}$$

It follows from (5) that the distribution of $H$ conditional on $G = g$ is
$N(\mu(g), v)$, where $\mu(g) = \gamma + (3g/\sqrt{11})$ and $v = 2/11$. Thus (10) is
equal to

$$\int_{-c_\alpha}^{c_\alpha} \left( \Phi(c_{\alpha_1}; \mu(g), v) - \Phi(-c_{\alpha_1}; \mu(g), v) \right) \varphi(g)\, dg \tag{11}$$

where $\Phi(x; \mu, v)$ denotes the $N(\mu, v)$ cumulative distribution function,
evaluated at $x$. The integral (11) is readily evaluated using the numerical
quadrature functions available in either R or MATLAB.
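The conditional mean and variance used in this appendix are the standard ones for a
bivariate normal pair with unit variances. With correlation $\rho = 3/\sqrt{11}$ and
means $0$ for $G$ and $\gamma$ for $H$,

$$E(H \mid G = g) = \gamma + \rho\,(g - 0) = \gamma + \frac{3g}{\sqrt{11}}, \qquad
\mathrm{Var}(H \mid G = g) = 1 - \rho^2 = 1 - \frac{9}{11} = \frac{2}{11}.$$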